A New Combined Approach for Inference in High-Dimensional Regression Models with Correlated Variables

Authors

  • Niharika Gauraha
  • Swapan K. Parui
Abstract

We consider the problem of model selection and estimation in sparse high-dimensional linear regression models with strongly correlated variables. First, we study the theoretical properties of the dual Lasso solution and show that jointly considering the Lasso primal and dual solutions is useful for selecting correlated active variables. Second, we argue that correlation among active predictors is not problematic, and we derive a new, weaker condition on the design matrix, called the Pseudo Irrepresentable Condition (PIC). Third, we present a new variable selection procedure, the Dual Lasso Selector, and show that PIC is a necessary and sufficient condition for consistent variable selection with the proposed method. Finally, combining the Dual Lasso Selector with Ridge estimation achieves even better prediction performance. We call this combination DLSelect+Ridge. We illustrate the DLSelect+Ridge method on a real dataset and compare it with popular existing methods in terms of variable selection and prediction accuracy.
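
The abstract does not spell out the procedure, so the following is only a minimal sketch of the select-then-refit idea under stated assumptions. It assumes the selector keeps every variable whose Lasso dual constraint is active, that is, |x_j^T (y - X b)| = lam at the Lasso solution b for the objective (1/2)||y - Xb||^2 + lam ||b||_1, and then refits the kept variables with Ridge. The function names (`lasso_cd`, `dual_lasso_select`, `ridge_refit`), the tolerance `tol`, the toy data, and the values of `lam` and `alpha` are all illustrative choices, not taken from the paper.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200):
    """Plain coordinate descent for (1/2)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual that leaves out the contribution of column j.
            r_j = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r_j, lam) / col_sq[j]
    return beta

def dual_lasso_select(X, y, lam, tol=1e-8):
    """Assumed selection rule: keep j where |x_j^T residual| reaches lam."""
    beta = lasso_cd(X, y, lam)
    corr = X.T @ (y - X @ beta)
    selected = np.flatnonzero(np.abs(corr) >= lam - tol)
    return selected, beta

def ridge_refit(X, y, selected, alpha):
    """Ridge estimate restricted to the selected columns; zeros elsewhere."""
    Xs = X[:, selected]
    gram = Xs.T @ Xs + alpha * np.eye(len(selected))
    coef = np.zeros(X.shape[1])
    coef[selected] = np.linalg.solve(gram, Xs.T @ y)
    return coef

# Toy design: columns 0 and 1 are identical (perfectly correlated) and both
# active; column 2 is orthogonal to them and inactive.
X = np.array([[ 1.0,  1.0,  1.0],
              [-1.0, -1.0,  1.0],
              [ 1.0,  1.0, -1.0],
              [-1.0, -1.0, -1.0]])
y = X[:, 0] + X[:, 1]

selected, beta_lasso = dual_lasso_select(X, y, lam=1.0)
coef = ridge_refit(X, y, selected, alpha=1.0)
print("Lasso support:", np.flatnonzero(beta_lasso))
print("Dual-selected:", selected)
print("Ridge refit:  ", coef)
```

On this toy design the coordinate-descent Lasso puts all of its weight on column 0, while the dual check keeps both identical columns and the Ridge refit splits the coefficient evenly between them. This only illustrates why the dual solution can help with correlated active variables; it is not a reproduction of the paper's experiments.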

Similar resources

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: With advances in science, knowledge, and technology, we increasingly deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimensional problems, classical methods are not reliable because of a large number of predictor variable...

Predicting the Potential Habitat Distribution of Crataegus Pontica C. Koch, Using a Combined Modeling Approach in Lorestan Province

Habitat degradation is one of the important causes of plant species extinction. Modeling techniques are widely used for identifying the potential habitats of different plant species. Thus, the purpose of the current study was to determine potential habitats of Zalzalak in Lorestan Province. Species presence data and 23 environmental variables were collected in Lorestan Province. Correlation analysis ...

Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data

Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume that predictor variables are independent and that the number of samples (n) is large relative to the number of covariates (p). Therefore, it is not possible to use conventional models for high-dimensional genetic data in which p > n. The present study compared th...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can be decisive when analyzing high-dimensional data, especially with a small number of samples. Feature extraction methods do not perform well under these conditions. With small sample sets and high-dimensional data, exploring a large search space and learning from insufficient samples become extremely hard. As a result, neural networks and clustering a...

Efficiency of Different Statistical Analysis Methods in Estimating Synthetic Unit Hydrograph Components for Watersheds in Northern Iran

The present research aimed to compare different methods of statistical analysis and to select the best method for modeling the components of the synthetic unit hydrograph from the physical characteristics of catchments in northern Iran, covering an area of 177000 km² in Guilan, Mazandaran and Golestan Provinces. For execution of the research, 9 physical charac...

Publication date: 2017